156 research outputs found

    Covariance estimation for distributions with $2+\varepsilon$ moments

    We study the minimal sample size $N=N(n)$ that suffices to estimate the covariance matrix of an $n$-dimensional distribution by the sample covariance matrix in the operator norm, with an arbitrary fixed accuracy. We establish the optimal bound $N=O(n)$ for every distribution whose $k$-dimensional marginals have uniformly bounded $2+\varepsilon$ moments outside the sphere of radius $O(\sqrt{k})$. In the specific case of log-concave distributions, this result provides an alternative approach to the Kannan-Lovasz-Simonovits problem, which was recently solved by Adamczak et al. [J. Amer. Math. Soc. 23 (2010) 535-561]. Moreover, a lower estimate on the covariance matrix holds under a weaker assumption: uniformly bounded $2+\varepsilon$ moments of one-dimensional marginals. Our argument consists of randomizing the spectral sparsifier, a deterministic tool developed recently by Batson, Spielman and Srivastava [SIAM J. Comput. 41 (2012) 1704-1721]. The new randomized method allows one to control the spectral edges of the sample covariance matrix via the Stieltjes transform evaluated at carefully chosen random points. Comment: Published in at http://dx.doi.org/10.1214/12-AOP760 the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org
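As a rough numerical illustration of the $N=O(n)$ regime described in this abstract, the sketch below measures the operator-norm error of the sample covariance matrix when $N$ is a constant multiple of $n$. It assumes standard Gaussian samples (identity covariance, known zero mean), which is a far lighter-tailed case than the $2+\varepsilon$ moment setting of the paper; the function name `covariance_error` and the chosen multiples are illustrative only.

```python
import numpy as np

def covariance_error(n, N, rng):
    """Operator-norm error of the sample covariance of N standard Gaussian samples in R^n."""
    X = rng.standard_normal((N, n))      # assumption: true covariance is the identity
    sample_cov = X.T @ X / N             # the mean is known to be zero, so no centering
    return np.linalg.norm(sample_cov - np.eye(n), ord=2)  # largest singular value

rng = np.random.default_rng(0)
n = 20
# Sample sizes N = c * n for a few constants c.
errors = {c: covariance_error(n, c * n, rng) for c in (2, 20, 200)}
for c, err in errors.items():
    print(f"N = {c}n: operator-norm error = {err:.3f}")
```

In this Gaussian toy case the error depends on the ratio $N/n$ rather than on $n$ alone, which is the sense in which a sample size linear in the dimension suffices for a fixed accuracy.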

    An Alon-Boppana Type Bound for Weighted Graphs and Lowerbounds for Spectral Sparsification

    We prove the following Alon-Boppana type theorem for general (not necessarily regular) weighted graphs: if $G$ is an $n$-node weighted undirected graph of average combinatorial degree $d$ (that is, $G$ has $dn/2$ edges) and girth $g > 2d^{1/8}+1$, and if $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of the (non-normalized) Laplacian of $G$, then $\frac{\lambda_n}{\lambda_2} \geq 1 + \frac{4}{\sqrt d} - O\left(\frac{1}{d^{5/8}}\right)$. (The Alon-Boppana theorem implies that if $G$ is unweighted and $d$-regular, then $\frac{\lambda_n}{\lambda_2} \geq 1 + \frac{4}{\sqrt d} - O\left(\frac{1}{d}\right)$ if the diameter is at least $d^{1.5}$.) Our result implies a lower bound for spectral sparsifiers. A graph $H$ is a spectral $\epsilon$-sparsifier of a graph $G$ if $L(G) \preceq L(H) \preceq (1+\epsilon) L(G)$, where $L(G)$ is the Laplacian matrix of $G$ and $L(H)$ is the Laplacian matrix of $H$. Batson, Spielman and Srivastava proved that for every $G$ there is an $\epsilon$-sparsifier $H$ of average degree $d$ where $\epsilon \approx \frac{4\sqrt 2}{\sqrt d}$ and the edges of $H$ are a (weighted) subset of the edges of $G$. Batson, Spielman and Srivastava also show that the bound on $\epsilon$ cannot be reduced below $\approx \frac{2}{\sqrt d}$ when $G$ is a clique; our Alon-Boppana-type result implies that $\epsilon$ cannot be reduced below $\approx \frac{4}{\sqrt d}$ when $G$ comes from a family of expanders of super-constant degree and super-constant girth. The method of Batson, Spielman and Srivastava proves a more general result, about sparsifying sums of rank-one matrices, and their method applies to an "online" setting. We show that for the online matrix setting the $4\sqrt 2 / \sqrt d$ bound is tight, up to lower order terms.

    Content Based Document Recommender using Deep Learning

    With recent advances in information technology, there has been a huge surge in the amount of data available. Information retrieval technology has not kept pace with this rate of information generation, so users spend excessive time retrieving relevant information. Although systems exist that help users search a database and that filter and recommend relevant information, recommendation systems that use the content of documents still have a long way to go before they mature. Here we present a Deep Learning based supervised approach to recommend similar documents based on the similarity of content. We combine the C-DSSM model with Word2Vec distributed representations of words to create a novel model that classifies a document pair as relevant/irrelevant by assigning a score to it. Using our model, retrieval of documents can be done in O(1) time and the memory complexity is O(n), where n is the number of documents. Comment: Accepted in ICICI 2017, Coimbatore, Indi

    Factors influencing bank deposits : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy (PhD) in Banking at Massey University, Palmerston North, New Zealand

    This thesis comprises three essays that investigate the effects of human capital, financial markets, and banking system development on bank deposits, deposit funding, and the proportions of retail and time deposits. The first two essays are country-level studies, whereas the third is at bank level. The data for the first essay were obtained from the World Bank and the World Health Organisation (WHO). For the second and third essays, bank-level data are from Bankscope and macroeconomic variables are from the World Bank. The first essay investigates the effects of human capital development on bank deposits, employing the 2SLS method in a cross-country setup. Human capital development includes the development of the healthcare system and education level. I use two dependent variables: the deposits to GDP ratio and the value of total deposits. Results show a positive relationship between human capital development and bank deposits. However, the impact of the healthcare system on total deposits is higher than on the bank deposits to GDP ratio, suggesting that an improvement in the healthcare system increases households’ income and that a proportion of the increased income goes into the banking system. The impact of education is higher in countries with high financial inclusion than in countries with low financial inclusion. The second essay examines the effects of financial markets development on bank deposits, using instrumental variables methods. Empirical results suggest that investors in developed and developing economies use financial markets differently. In highly financially integrated economies, the financial markets and banking system complement each other, whereas in fragmented markets they compete. The third essay explores the effects of competition on bank deposit funding and composition. Interest cost is used to measure deposit competition, and the Herfindahl-Hirschman Index (HHI3) at deposits and loans levels is used to measure market structure. The results show that increased deposit competition encourages banks to increase the proportion of less costly funds, causing a reduction in deposit funding. In contrast, high interest rates attract retail depositors, especially for time deposits, thereby increasing the proportion of retail deposits. However, this finding varies according to the financial development level of the countries. Market concentration shows negative effects on bank deposit funding and composition.
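For readers unfamiliar with the concentration measure mentioned in this abstract, the sketch below computes the standard Herfindahl-Hirschman Index as the sum of squared market shares. The abstract does not define its "HHI3" variant, so this is only the textbook index; the function name `hhi` and the deposit figures are made up for illustration.

```python
def hhi(volumes):
    """Standard Herfindahl-Hirschman Index: sum of squared market shares (shares in [0, 1])."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum((v / total) ** 2 for v in volumes)

# Hypothetical deposit volumes of three banks in one market.
deposits = [50.0, 30.0, 20.0]
print(f"HHI = {hhi(deposits):.2f}")
```

The index ranges from $1/m$ for $m$ equally sized banks up to 1 for a monopoly, so higher values indicate a more concentrated market.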

    Twice-Ramanujan Sparsifiers

    We prove that every graph has a spectral sparsifier with a number of edges linear in its number of vertices. As linear-sized spectral sparsifiers of complete graphs are expanders, our sparsifiers of arbitrary graphs can be viewed as generalizations of expander graphs. In particular, we prove that for every $d>1$ and every undirected, weighted graph $G=(V,E,w)$ on $n$ vertices, there exists a weighted graph $H=(V,F,\tilde{w})$ with at most $\lceil d(n-1) \rceil$ edges such that for every $x \in \mathbb{R}^{V}$, $x^{T}L_{G}x \leq x^{T}L_{H}x \leq \left(\frac{d+1+2\sqrt{d}}{d+1-2\sqrt{d}}\right)\cdot x^{T}L_{G}x$, where $L_{G}$ and $L_{H}$ are the Laplacian matrices of $G$ and $H$, respectively. Thus, $H$ approximates $G$ spectrally at least as well as a Ramanujan expander with $dn/2$ edges approximates the complete graph. We give an elementary deterministic polynomial time algorithm for constructing $H$.
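To get a feel for the trade-off stated in this abstract, the sketch below evaluates the edge budget $\lceil d(n-1) \rceil$ and the approximation factor $\frac{d+1+2\sqrt{d}}{d+1-2\sqrt{d}}$, which equals $\left(\frac{\sqrt{d}+1}{\sqrt{d}-1}\right)^2$ because the numerator and denominator are the squares of $\sqrt{d}+1$ and $\sqrt{d}-1$. It only evaluates the two formulas and does not construct a sparsifier; the function names are illustrative.

```python
import math

def edge_budget(n, d):
    """Maximum number of edges in the sparsifier: ceil(d * (n - 1))."""
    return math.ceil(d * (n - 1))

def approximation_factor(d):
    """Upper factor (d + 1 + 2*sqrt(d)) / (d + 1 - 2*sqrt(d)), defined for d > 1."""
    if d <= 1:
        raise ValueError("the bound requires d > 1")
    root = math.sqrt(d)
    return (d + 1 + 2 * root) / (d + 1 - 2 * root)

for d in (4, 9, 100):
    print(f"d = {d}: at most {edge_budget(1000, d)} edges, factor {approximation_factor(d):.3f}")
```

Larger $d$ buys a factor closer to 1 at the cost of proportionally more edges.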